The process of polymerizing a protein by a ribosome, using a messenger RNA (mRNA) as the
corresponding template, is called translation. A ribosome may be regarded as a molecular motor for
which the mRNA template also serves as the track. Often several ribosomes translate the same
mRNA simultaneously. The ribosomes bound simultaneously to a single mRNA transcript are
the members of a polyribosome (or, simply, polysome). An experimentally measured polysome profile
gives the distribution of polysome sizes. Recently a breakthrough has been achieved in determining the
instantaneous positions of the ribosomes on a given mRNA track; the technique is called
ribosome profiling [J Q]. Motivated by the success of these techniques, we have studied the
spatio-temporal organization of ribosomes by extending a theoretical model that we have reported
elsewhere Qj. This extended version of our model incorporates not only (i) the mechano-chemical
cycle of individual ribosomes, and (ii) their steric interactions, but also (iii) the effects of (a) kinetic
proofreading, (b) translational infidelity, (c) ribosome recycling, and (d) sequence inhomogeneities.
The theoretical framework developed here will serve in guiding further experiments and in analyzing
the data to gain deep insight into various kinetic processes involved in translation.

The ribosome is a macromolecular complex and operates as one of the essential intracellular machines [7] that
participate in gene expression in all living cells @, Q . More specifically, it polymerizes a protein, which is a linear
heteropolymer consisting of amino-acid monomers, each of which is linked to the next one by a peptide bond. Therefore,
a growing protein is also called a polypeptide. For the synthesis of a protein, a messenger RNA (mRNA) serves as
the template; the sequence of the amino acid species in the protein is determined by that of the codons (triplets of 
nucleotides) on the corresponding mRNA template. This process is called translation. Translation by every ribosome 
goes through three main stages: (i) initiation, (ii) elongation, and (iii) termination. The start and stop codons mark
the positions on the template mRNA where initiation and termination of translation take place.
During the elongation stage, at every codon, the amino acid monomer required for elongating the protein is supplied
by an incoming tRNA molecule; the correct amino acid monomer is carried by those tRNA whose anti-codon is 
complementary to the codon. The machinery of translation deploys a quality control mechanism which screens
the incoming tRNA through a multi-step selection process. However, in spite of this stringent selection process, 
occasionally an incorrect amino acid may escape rejection by the quality control system; a translational error results 
if the growing protein incorporates an incorrect amino acid monomer thereby lowering the fidelity of translation. In any 
case, after the termination, a ribosome is partly disassembled. These parts can reach near the start codon by diffusion 
in the surrounding aqueous medium. A ribosome can be assembled more quickly from these parts than from basic 
constituents. Moreover, in case the start and the stop codons are close to each other because of the loop formation by 
the mRNA, diffusive transfer of the parts of the ribosome from the stop codon to the start codon can be quite rapid 
leading to a faster recycling of the ribosomes [10]. In rare cases, the elongation process is aborted because of the premature
detachment of the ribosome from the mRNA track. Furthermore, often several ribosomes translate the same mRNA
transcript simultaneously, each polymerizing a distinct copy of the same protein. Because of the superficial similarities
with vehicular traffic on a given stretch of a highway [11]-[13], the simultaneous collective translation of an mRNA by
several ribosomes is sometimes referred to as ribosome traffic. The ribosomes bound simultaneously to a single mRNA
transcript are the members of a polyribosome (or, simply, polysome) [14]-[17]. Because of the mutual hindrance of
the ribosomes, the overall rate of protein synthesis is expected to attain a maximum at an optimum mean separation 
between the ribosomes. Finally, the ongoing production and decay of mRNA transcripts and various feedback loops 
in gene expression also control the rate of protein synthesis. 
It would be desirable to capture all the processes mentioned above within a single theoretical model of translation.
However, it is extremely unlikely that such a model can be treated analytically. Therefore, the aim of this paper
is more modest. Here we extend our earlier model [3, EH- by capturing (i) the mechano-chemical cycle of individual
ribosomes, and (ii) their steric interactions, as well as (iii) the effects of (a) quality-control mechanisms, (b) translational
error, (c) ribosome recycling, and (d) sequence inhomogeneity of the mRNA.

The overall rate of synthesis of proteins is a key quantity in any model of translation. However, the main focus
of our theoretical study here is the size of the polysome and the spatial distribution of ribosomes on an mRNA. We
identify the different parameter regimes of our theoretical model and characterize these in terms of the average density
of the polysome and the overall average rate of synthesis of proteins from a single mRNA transcript. Moreover, going
beyond the scope of all the previous theoretical works on this topic, we predict the nature of the fluctuations in the
spatial organization of the ribosomes, which throws light on the fluctuations in the size of ribosome clusters on a
given mRNA transcript.

In this paper we also suggest a new experiment for testing our theoretical predictions on the statistical properties of
polysomes. The traditional technique of polysome profiling [19, 20] provides the number of ribosomes bound to an mRNA,
but not their individual positions at the instant when translation was stopped by the experimental protocol. An
improved version of this technique, called ribosome density mapping [21], provides more detailed information on the
numbers of ribosomes associated with specified segments of a particular mRNA by carrying out site-specific cleavage
of the mRNA transcript. The results obtained using these techniques are often adequate for getting a qualitative
indicator of the translational activity. However, the ribosomes are not expected to be uniformly distributed on an
mRNA because of the stochasticities in the steps of the mechano-chemical cycles of these cyclic machines. These
stochasticities arise from (i) intrinsic fluctuations in biochemical processes at low copy numbers of the molecules, and
(ii) extrinsic fluctuations arising from the sequence inhomogeneity of the mRNA. The most detailed picture of the
translational activity has been obtained by a recently developed technique called ribosome profiling [H, HJ. For testing
some of our theoretical predictions, the older technique of polysome profiling is adequate, whereas for testing the other
new results ribosome profiling would be necessary.

This paper is organized as follows: we introduce our model and write down the master equations for the stochastic
kinetics of this model in section II. In section III we solve the master equations in the steady state under periodic
boundary conditions to calculate the overall rate of protein synthesis. The results demonstrate the effects of steric
hindrance caused by congestion in ribosome traffic. The spatio-temporal organization of the ribosomes in different
parameter regimes corresponds to the different non-equilibrium phases on the "phase diagrams" which we plot in section
IV. The instantaneous spatial distribution of the ribosomes on a single mRNA is also characterized in terms of some
quantitative measures which we introduce in section V, where we also explore the effects of sequence inhomogeneities
of mRNA. Finally, in section VI, the main results are summarized and important conclusions are drawn.

II. MODEL 

The kinetic models of translation can be divided into three categories. Translation is just a single step in the
broader context of gene expression. However, in most of the kinetic models of gene expression [22], the details of
the mechano-chemistry of individual ribosomes as well as their mutual steric interactions are ignored. The rates
of synthesis and degradation of proteins are usually captured in these models by two rate constants without any
mechanistic details of these two processes. We are not concerned with a global picture of gene expression in this paper
and, therefore, such kinetic models will not be discussed further here.

There are models of translation which are intended to describe various key aspects of the stochastic mechano-chemical
kinetics of only a single ribosome. In contrast, another class of models of translation is motivated by
polysome formation. Most of these models capture the effects of the entire mechano-chemical cycle by a single parameter.
These models focus mainly on the effects of mutual steric interactions of the ribosomes on the overall rate of protein
synthesis. In this section we develop a model by capturing both these aspects of translation, namely, the details of
single-ribosome mechano-chemistry and the effects of steric interactions among the ribosomes on the same mRNA transcript.
However, for the convenience of comparison of our work with earlier works, we summarize the main features of the 
TASEP-type models in the next subsection. 

A. TASEP-type models 

The totally asymmetric simple exclusion process (TASEP) [23, 24] is one of the simplest models of interacting
self-propelled particles; it is used extensively for understanding the generic features of non-equilibrium steady states of
interacting systems. TASEP and its various extensions exhibit interesting dynamical phase transitions [25]. For many
years, various biologically motivated extensions of TASEP [26]-[36] have been used to model ribosome traffic. In the
TASEP-based models of ribosome traffic (see ref. [34] for a recent review) each lattice site represents a single codon.
Since a ribosome is much larger than a single codon, each ribosome is represented by a hard rod that covers $\ell$ ($\ell > 1$)
sites simultaneously. However, the allowed step size of a rod is one lattice site (i.e., one codon). This extended version of
TASEP for hard rods will be referred to as $\ell$-TASEP. As long as a site remains covered by a ribosome, it is inaccessible
to the other rods. The entry of a rod from one end (at a rate $\alpha$) and its eventual exit from the other end (at a rate
$\beta$) model the initiation and termination stages of translation by a ribosome.

The steps of the mechano-chemical cycle of individual ribosomes during the elongation stage were not captured
explicitly in the simple TASEP-type models; instead, a single "hopping" parameter was used to describe the rate
of translation of one codon. Moreover, these TASEP-type models neither incorporate any mechanism for selecting
specifically the correct amino acid monomer, nor do these allow for the possibility of translational error. Therefore, 
such TASEP-type models are too simple to account for the effects of various mechano-chemical processes on the 
statistical properties of polysomes. 
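
The exclusion rule of the $\ell$-TASEP summarized above can be made concrete with a few lines of code. The following is only a minimal sketch, assuming a closed ring and a random-sequential update; the function names `ltasep_sweep` and `gaps`, the hopping probability `p_hop` and the update rule itself are choices made for this illustration and are not taken from the models cited above.

```python
import random

def ltasep_sweep(positions, L, ell, p_hop, rng):
    """Perform one random-sequential sweep of an l-TASEP on a ring of L sites.

    positions : list with the leftmost site of each rod, in cyclic order.
    ell       : number of sites covered by one rod (one ribosome).
    p_hop     : probability that a chosen, unblocked rod advances by one site.
    Returns the number of successful forward hops in this sweep.
    """
    n = len(positions)
    hops = 0
    for _ in range(n):
        k = rng.randrange(n)                       # pick a rod at random
        if n == 1:
            gap = L                                # a lone rod is never blocked
        else:
            gap = (positions[(k + 1) % n] - positions[k]) % L
        # A rod at site i covers i .. i+ell-1; it may advance only if site
        # i+ell is not the leftmost site of the rod ahead, i.e. gap > ell.
        if gap > ell and rng.random() < p_hop:
            positions[k] = (positions[k] + 1) % L
            hops += 1
    return hops

def gaps(positions, L):
    """Distances between the leftmost sites of consecutive rods on the ring."""
    n = len(positions)
    return [(positions[(k + 1) % n] - positions[k]) % L for k in range(n)]

if __name__ == "__main__":
    rng = random.Random(1)
    L, ell = 120, 10
    rods = [0, 10, 20, 30, 60]                     # five rods, some touching
    total = sum(ltasep_sweep(rods, L, ell, 0.5, rng) for _ in range(200))
    print("mean number of hops per sweep:", total / 200)
```

Because a rod moves only when the gap to the rod ahead exceeds $\ell$, no two rods can ever overlap, and a fully packed ring produces no hops at all.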

B. Our model: unification of single-ribosome mechano-chemistry and TASEP 

In recent years, progressively more realistic models of translation have been developed [H, HH, [37]-[39]] and several
analytical results have been derived. Using the most recent version of this model @, some statistical properties of
a single ribosome have been derived analytically Q . Here we extend this model even further to capture some features
of translation which were not included in its earlier version. Using this extended version of our model of translation, 
we make experimentally testable predictions on the dependence of the statistical properties of the polysomes on the 
various mechano-chemical processes involved in translation. 

Each ribosome consists of two subunits which are designated as "large" subunit and "small" subunit, respectively. 
The translation of the genetic message encoded in the codon is carried out by the small subunit while the elongation 
of the polypeptide, by the formation of a peptide bond between it and the incoming amino acid, takes place in the 
large subunit. The function of the two subunits is coordinated by the tRNA molecules. There are three binding sites 
for a tRNA on each ribosome; these sites are designated as E, P and A. An incoming tRNA binds with the A site.
The amino acid carried by a tRNA is linked to the growing polypeptide by a peptide bond while the tRNA is bound 
to the P site. Finally, the denuded tRNA exits from the ribosome from the E site. During the elongation stage, 
in each complete mechano-chemical cycle, the ribosome steps forward on the template mRNA by one codon while, 
simultaneously, the polypeptide gets elongated by one amino acid. These processes are captured explicitly in our
kinetic model.

The distinct mechano-chemical states in our model and the allowed transitions among these states are shown
schematically in Fig. 1. At the beginning of each cycle, the system is in state 1, where the sites E and A are empty
while the site P is occupied by a tRNA that has just contributed its amino acid to the growing polypeptide. A tRNA
charged with an amino acid is called an aminoacyl tRNA (aa-tRNA). At this stage, an aa-tRNA, bound to
an elongation factor Tu (EF-Tu) and a molecule of guanosine triphosphate (GTP), enters the ribosome and binds to
the A site. This process takes place with rate $\omega_a$ and causes a transition of the system to the
chemical state 2. Thereafter non-cognate tRNAs are rejected, and the system reverts to state 1 with rate $\omega_{r1}$,
through a quality control mechanism based on the free energy of codon-anticodon matching.

However, the free-energy difference between the cognate and near-cognate tRNAs is too small to distinguish between
them. Therefore, usually, near-cognate tRNAs are not rejected at this stage. A second stage of quality control, called
kinetic proofreading [40, 41], is then activated. GTP, which is bound to the aa-tRNA, is hydrolyzed to GDP by EF-Tu,
and this process is described by the transition from the state 2 to the state 3, which takes place with rate $\omega_{h1}$.
At this stage, barring a few exceptional cycles, the near-cognate tRNAs are rejected from chemical state 3, which drives
the system back to the chemical state 1; this happens with rate constant $\omega_{r2}$.

Although most often the non-cognate and near-cognate tRNAs are rejected by the two-stage selection process,
occasionally the quality control system still fails to reject an incorrect (non-cognate or near-cognate) tRNA. Consequently,
there is a small, but non-vanishing, probability of a translational error when the growing polypeptide elongates by
the formation of a peptide bond with an incorrect amino acid. In our model, the incorporation of an incorrect amino
acid leads to a branched pathway: in contrast to the transition $3 \to 4$ (rate $\omega_p$) along the correct pathway, the
wrong pathway proceeds by the transition $3 \to 4^*$ (rate $\Omega_p$). Arrival of another elongation factor, called EF-G,
along with a molecule of GTP also takes place at this stage. The transition $4 \rightleftharpoons 5$ (or $4^* \rightleftharpoons 5^*$) is reversible
and is essentially a Brownian rotation of the two subunits relative to each other. This spontaneous Brownian rotation
drives the two tRNA molecules back and forth between the classical P/P, A/A state and the hybrid E/P, P/A state [42].
The rate constants for the forward and backward Brownian rotations are denoted by $\omega_{bf}$ and $\omega_{br}$, respectively,
along the correct pathway, whereas the same transitions along the wrong pathway take place with the rates $\Omega_{bf}$
and $\Omega_{br}$, respectively. Finally, hydrolysis of GTP by EF-G drives the process of translocation, at the end of which
the two tRNA molecules are positioned at the E and P sites while the ribosome finds itself poised to translate the next
codon; the denuded tRNA molecule exits from the E site. The transitions $5 \to 1$ and $5^* \to 1$ take place with the
rates $\omega_{h2}$ and $\Omega_{h2}$, respectively. The completion of the full cycle elongates the protein by one amino acid
(by a correct amino acid along one pathway and by an incorrect amino acid along the other) and translocates the
ribosome by one codon on the template mRNA (for further details, see ref. [42]).

Since our model allows the possibility of translational error, we define [18] the fidelity $\phi$ of translation as the fraction
of the incorporated amino acids which are correct, i.e.,

$$\phi = \frac{\omega_p}{\omega_p + \Omega_p}. \tag{1}$$

In our model, the mRNA track is represented by a one-dimensional lattice where each of the total $L$ sites corresponds
to a single codon. The length of a ribosome is denoted by $\ell$ in the units of the length of a single codon. A ribosome
can move forward by only one site (i.e., one codon) at a time. We use the convention that the leftmost site covered
by a ribosome is the one that is being translated by it; the leftmost site covered by a ribosome is also used in our
formulation to denote the position of a ribosome. Thus, throughout this paper, we follow the convention that, at any
instant of time, a ribosome "covers" $\ell$ sites but "occupies" only the leftmost of these $\ell$ sites.

According to our notation, the status of coverage of a site is denoted by 0 and 1; 0 represents an uncovered lattice
site whereas 1 represents a covered site. Since many ribosomes move simultaneously on the same track, they also
interact with each other. The simplest form of interaction would be mutual exclusion: if the $i$th site is "occupied"
by one ribosome, then all the $\ell$ sites from $i$ to $i + \ell - 1$ are "covered" by it and, therefore, none of these $\ell$ sites
is accessible to any other ribosome at that instant of time. Moreover, a ribosome occupying the position $i$ can
move forward if, and only if, the site $i + \ell$ is not simultaneously occupied by another ribosome. In our notation, the
symbol $P(\underbrace{1 \cdots 1}_{\ell} \mid 0)$ represents the conditional probability that, given an uncovered site, the $\ell$ successive
adjacent sites to its left are all covered simultaneously by a single ribosome. Similarly, $P(0 \mid \underbrace{1 \cdots 1}_{\ell})$
is the conditional probability of finding an empty site $j$, given that the $\ell$ successive sites on its left are covered by a
ribosome.

Using the same notation, we now define $Q(i \mid i + \ell)$ as the conditional probability that the site $i + \ell$ is not occupied
by another ribosome, given that the site $i$ is occupied by a ribosome. Similarly, given that the site $i$ is occupied by a
ribosome, the probability that the site $i - \ell$ is not occupied by another ribosome is given by the conditional probability
$Q(i - \ell \mid i)$. It is straightforward to show that [38, 39]

$$Q = \frac{1 - \rho\ell}{1 + \rho - \rho\ell}, \tag{2}$$

where $\rho = N/L$ is the number density of the ribosomes on the mRNA track. Following the same prescription that
one of us (DC) and his collaborators used in earlier, simpler models of ribosome traffic [38, 39], we multiply the rate
constants $\omega_{h2}$ and $\Omega_{h2}$ by $Q$ because a ribosome "feels" the mutual exclusion only when it tends to move forward to
the next codon.
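
Equation (2) can be checked against a simple counting argument: if every rod of length $\ell$ is shrunk to a single particle, the site just ahead of a rod is a hole with probability (number of holes)/(number of holes + number of rods). The sketch below illustrates this; the function names are hypothetical and are introduced only for this example.

```python
def gap_probability(rho, ell):
    """Q of Eq. (2): probability that the site just ahead of a ribosome is free."""
    if rho < 0 or rho * ell > 1:
        raise ValueError("the density must satisfy 0 <= rho <= 1/ell")
    return (1.0 - rho * ell) / (1.0 + rho - rho * ell)

def gap_probability_counting(N, L, ell):
    """Same quantity from counting: holes / (holes + rods) after shrinking
    every rod of ell sites to a single particle."""
    holes = L - N * ell
    return holes / (holes + N)

if __name__ == "__main__":
    # 50 ribosomes of size 10 on a track of 1000 codons (rho = 0.05).
    print(gap_probability(0.05, 10), gap_probability_counting(50, 1000, 10))
```

For $\ell = 1$ the expression reduces to $Q = 1 - \rho$, the familiar mean-field factor of the ordinary TASEP.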

By the symbol $P_\mu(i, t)$ we denote the probability of finding the ribosome in the $\mu$th chemical state while it occupies
the site $i$ at time $t$. The master equations for the probabilities $P_\mu(i,t)$ are

$$\frac{dP_1(i,t)}{dt} = -\omega_a P_1(i,t) + \omega_{r1} P_2(i,t) + \omega_{r2} P_3(i,t) + \omega_{h2}\,Q\,P_5(i-1,t) + \Omega_{h2}\,Q\,P_5^*(i-1,t) \tag{3}$$

$$\frac{dP_2(i,t)}{dt} = \omega_a P_1(i,t) - (\omega_{r1} + \omega_{h1})\,P_2(i,t) \tag{4}$$

$$\frac{dP_3(i,t)}{dt} = \omega_{h1} P_2(i,t) - (\omega_p + \Omega_p + \omega_{r2})\,P_3(i,t) \tag{5}$$

$$\frac{dP_4(i,t)}{dt} = \omega_p P_3(i,t) - \omega_{bf} P_4(i,t) + \omega_{br} P_5(i,t) \tag{6}$$

$$\frac{dP_5(i,t)}{dt} = \omega_{bf} P_4(i,t) - (\omega_{h2} Q + \omega_{br})\,P_5(i,t) \tag{7}$$

$$\frac{dP_4^*(i,t)}{dt} = \Omega_p P_3(i,t) - \Omega_{bf} P_4^*(i,t) + \Omega_{br} P_5^*(i,t) \tag{8}$$

$$\frac{dP_5^*(i,t)}{dt} = \Omega_{bf} P_4^*(i,t) - (\Omega_{h2} Q + \Omega_{br})\,P_5^*(i,t) \tag{9}$$
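
To illustrate how the master equations behave, the sketch below integrates them for a spatially uniform state (so that the term at site $i-1$ equals the one at site $i$) until the steady state is reached. It is only an illustration under stated assumptions: the dictionary keys, the explicit Euler scheme, the step size and the numerical rate values are choices made for this example, and $Q$ is treated as a fixed input rather than being computed self-consistently from the density.

```python
STATES = ["1", "2", "3", "4", "5", "4*", "5*"]

# Illustrative rate constants in 1/s (assumed values); lower-case "omega"
# belongs to the correct pathway, upper-case "Omega" to the erroneous one.
RATES = {
    "omega_a": 25.0, "omega_r1": 10.0, "omega_h1": 25.0, "omega_r2": 10.0,
    "omega_p": 25.0, "omega_bf": 25.0, "omega_br": 25.0, "omega_h2": 25.0,
    "Omega_p": 5.0, "Omega_bf": 5.0, "Omega_br": 5.0, "Omega_h2": 5.0,
}

def master_rhs(P, w, Q):
    """Time derivatives of Eqs. (3)-(9) for a spatially uniform state,
    where P_mu(i-1, t) = P_mu(i, t) for every site i."""
    return {
        "1": (-w["omega_a"] * P["1"] + w["omega_r1"] * P["2"]
              + w["omega_r2"] * P["3"]
              + w["omega_h2"] * Q * P["5"] + w["Omega_h2"] * Q * P["5*"]),
        "2": w["omega_a"] * P["1"] - (w["omega_r1"] + w["omega_h1"]) * P["2"],
        "3": (w["omega_h1"] * P["2"]
              - (w["omega_p"] + w["Omega_p"] + w["omega_r2"]) * P["3"]),
        "4": (w["omega_p"] * P["3"] - w["omega_bf"] * P["4"]
              + w["omega_br"] * P["5"]),
        "5": (w["omega_bf"] * P["4"]
              - (w["omega_h2"] * Q + w["omega_br"]) * P["5"]),
        "4*": (w["Omega_p"] * P["3"] - w["Omega_bf"] * P["4*"]
               + w["Omega_br"] * P["5*"]),
        "5*": (w["Omega_bf"] * P["4*"]
               - (w["Omega_h2"] * Q + w["Omega_br"]) * P["5*"]),
    }

def relax_to_steady_state(rho, w, Q, dt=2e-3, steps=50000):
    """Explicit Euler integration, starting with all weight rho in state 1."""
    P = {s: 0.0 for s in STATES}
    P["1"] = rho
    for _ in range(steps):
        dP = master_rhs(P, w, Q)
        for s in STATES:
            P[s] += dt * dP[s]
    return P

if __name__ == "__main__":
    Q = 0.5
    P = relax_to_steady_state(rho=0.05, w=RATES, Q=Q)
    flux = Q * (RATES["omega_h2"] * P["5"] + RATES["Omega_h2"] * P["5*"])
    print("steady-state flux:", flux)
```

The right-hand sides sum to zero, so the total probability per site is conserved during the integration; in the steady state the flux into state 4 balances the flux out of state 5.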
Equations (3)-(7) correspond to the equations (52)-(56) of ref. [39]. However, there are some additional terms in (3)-(7)
because of (i) the kinetic proofreading, (ii) the translational error caused by wrong amino acid selection, and (iii) the
reversible nature of the transition between the states 4 and 5. Moreover, the precise interpretations of the respective states
were slightly different in ref. [39] (see ref. [39] for the detailed interpretations of the states and the transitions between
them).

Note that the normalization condition for the probabilities is

$$\sum_{\mu=1}^{5} P_\mu(i,t) + P_4^*(i,t) + P_5^*(i,t) = \rho. \tag{10}$$



Molecular mechanisms that lead to mistranslation have been under intense investigation for decades (see [43] for a
recent review). However, to our knowledge, none of the earlier models, including that developed in ref. [39], provides
a mathematical framework to treat the mechanisms of mistranslation analytically.

III. RATE OF PROTEIN SYNTHESIS: EFFECTS OF HINDRANCE IN RIBOSOME TRAFFIC 

We solve the equations (3)-(9) in the steady state under the normalization condition (10), imposing periodic
boundary conditions (PBC). Using the steady-state solutions for $P_\mu$, we get the following expression for the
steady-state flux:

$$J_{PBC} = (P_5\,\omega_{h2} + P_5^*\,\Omega_{h2})\,Q = \rho\,K_{eff}\left[1 + (\Omega_p/\omega_p)\right], \tag{11}$$



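where

$$K_{eff}^{-1} = \omega_a^{-1}\left[1 + \frac{\omega_{r1}}{\omega_{h1}}\right]\left[1 + \frac{\omega_{r2}}{\omega_p}\right] + \omega_{h1}^{-1}\left[1 + \frac{\omega_{r2}}{\omega_p}\right] + \omega_p^{-1} + \omega_{bf}^{-1}\left[1 + \frac{\omega_{br}}{\omega_{h2}Q}\right] + (\omega_{h2}Q)^{-1}$$
$$\qquad + \frac{\Omega_p}{\omega_p}\left\{\omega_a^{-1}\left[1 + \frac{\omega_{r1}}{\omega_{h1}}\right] + \omega_{h1}^{-1} + \Omega_{bf}^{-1}\left[1 + \frac{\Omega_{br}}{\Omega_{h2}Q}\right] + (\Omega_{h2}Q)^{-1}\right\}. \tag{12}$$

Separating out the $Q$-dependent and $Q$-independent parts, $K_{eff}$ can be re-expressed as

$$K_{eff}^{-1} = k_1^{-1} + (k_2 Q)^{-1}, \tag{13}$$

where

$$k_1^{-1} = \omega_a^{-1}\left[1 + \frac{\omega_{r1}}{\omega_{h1}}\right]\left[1 + \frac{\omega_{r2}}{\omega_p}\right] + \omega_{h1}^{-1}\left[1 + \frac{\omega_{r2}}{\omega_p}\right] + \omega_p^{-1} + \omega_{bf}^{-1} + \frac{\Omega_p}{\omega_p}\left\{\omega_a^{-1}\left[1 + \frac{\omega_{r1}}{\omega_{h1}}\right] + \omega_{h1}^{-1} + \Omega_{bf}^{-1}\right\} \tag{14}$$

and

$$k_2^{-1} = \omega_{h2}^{-1}\left[1 + \frac{\omega_{br}}{\omega_{bf}}\right] + \frac{\Omega_p}{\omega_p}\,\Omega_{h2}^{-1}\left[1 + \frac{\Omega_{br}}{\Omega_{bf}}\right]. \tag{15}$$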



Note that when simultaneously $\omega_{r2} \to 0$, $\Omega_p \to 0$ and $\omega_{br} \to 0$, the expressions for $k_1$ and $k_2$ reduce to $K_{eff}$ and $\omega_{h2}$,
respectively.

So $J_{PBC}$ is given by

$$J_{PBC} = \frac{k_2\,\rho\,(1 - \rho\ell)\left[1 + (\Omega_p/\omega_p)\right]}{\left[1 + (k_2/k_1)\right](1 - \rho\ell) + \rho}. \tag{16}$$
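
The algebra leading from Eq. (12) to Eq. (16) can be verified numerically. In the sketch below the split into $k_1$ and $k_2$ is obtained by collecting the $Q$-independent and the $Q$-dependent terms of Eq. (12), so that $K_{eff}^{-1} = k_1^{-1} + (k_2 Q)^{-1}$; the function names, the dictionary keys and the numerical rate values are assumptions made for this illustration.

```python
# Illustrative rate constants in 1/s (assumed values for this example).
RATES = {
    "omega_a": 25.0, "omega_r1": 10.0, "omega_h1": 25.0, "omega_r2": 10.0,
    "omega_p": 25.0, "omega_bf": 25.0, "omega_br": 25.0, "omega_h2": 25.0,
    "Omega_p": 5.0, "Omega_bf": 5.0, "Omega_br": 5.0, "Omega_h2": 5.0,
}

def _common(w):
    err = w["Omega_p"] / w["omega_p"]                       # Omega_p / omega_p
    sel = 1.0 + w["omega_r2"] / w["omega_p"]                # proofreading factor
    bind = (1.0 + w["omega_r1"] / w["omega_h1"]) / w["omega_a"]
    return err, sel, bind

def inverse_k_eff(w, Q):
    """1/K_eff written out term by term, following Eq. (12)."""
    err, sel, bind = _common(w)
    correct = (bind * sel + sel / w["omega_h1"] + 1.0 / w["omega_p"]
               + (1.0 + w["omega_br"] / (w["omega_h2"] * Q)) / w["omega_bf"]
               + 1.0 / (w["omega_h2"] * Q))
    wrong = (bind + 1.0 / w["omega_h1"]
             + (1.0 + w["Omega_br"] / (w["Omega_h2"] * Q)) / w["Omega_bf"]
             + 1.0 / (w["Omega_h2"] * Q))
    return correct + err * wrong

def inverse_k1_k2(w):
    """Return (1/k1, 1/k2): the Q-independent and Q-dependent parts of 1/K_eff."""
    err, sel, bind = _common(w)
    inv_k1 = (bind * sel + sel / w["omega_h1"] + 1.0 / w["omega_p"]
              + 1.0 / w["omega_bf"]
              + err * (bind + 1.0 / w["omega_h1"] + 1.0 / w["Omega_bf"]))
    inv_k2 = ((1.0 + w["omega_br"] / w["omega_bf"]) / w["omega_h2"]
              + err * (1.0 + w["Omega_br"] / w["Omega_bf"]) / w["Omega_h2"])
    return inv_k1, inv_k2

def flux_pbc(rho, ell, w):
    """Steady-state flux under periodic boundary conditions, Eq. (16)."""
    err, _, _ = _common(w)
    inv_k1, inv_k2 = inverse_k1_k2(w)
    ratio = inv_k1 / inv_k2                                 # k2 / k1
    free = 1.0 - rho * ell
    return (rho * free * (1.0 + err) / inv_k2) / ((1.0 + ratio) * free + rho)

if __name__ == "__main__":
    for rho in (0.02, 0.05, 0.08):
        print(rho, flux_pbc(rho, 10, RATES))
```

The flux vanishes both at $\rho = 0$ (no ribosomes) and at $\rho = 1/\ell$ (a fully covered track), which is why a single maximum appears in between.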



Since no premature detachment of ribosomes is allowed in our model, the flux of the ribosomes is also the total rate
of protein synthesis. In the special case $\omega_{r2} = \omega_{br} = \Omega_p = \Omega_{bf} = \Omega_{br} = \Omega_{h2} = 0$, the expression (16) reduces to
the corresponding expression for flux derived in ref. [39]. The expression (16) for $J_{PBC}(\rho)$ exhibits a single maximum,
which occurs at the number density

$$\rho_* = \frac{\sqrt{\left[1 + (k_2/k_1)\right]/\ell}}{1 + \sqrt{\ell\left[1 + (k_2/k_1)\right]}}. \tag{17}$$



The equations (15) and (17) reduce, respectively, to the equations (58) and (63) of ref. [39].
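
The location of the flux maximum can be cross-checked by brute force. The sketch below compares the closed-form density $\sqrt{[1 + (k_2/k_1)]/\ell}\,/\,(1 + \sqrt{\ell[1 + (k_2/k_1)]})$ with a grid search over the density dependence $\rho(1-\rho\ell)/\{[1 + (k_2/k_1)](1-\rho\ell) + \rho\}$ of the flux; the function names and the grid resolution are assumptions made for this example.

```python
import math

def rho_star(ell, k2_over_k1):
    """Closed-form density at which the flux under PBC is maximal."""
    a = 1.0 + k2_over_k1
    return math.sqrt(a / ell) / (1.0 + math.sqrt(ell * a))

def flux_shape(rho, ell, k2_over_k1):
    """Density dependence of the flux, up to a constant prefactor."""
    a = 1.0 + k2_over_k1
    free = 1.0 - rho * ell
    return rho * free / (a * free + rho)

def rho_star_numeric(ell, k2_over_k1, n=20001):
    """Brute-force maximum on a uniform grid covering 0 <= rho <= 1/ell."""
    grid = [i / ((n - 1) * ell) for i in range(n)]
    return max(grid, key=lambda r: flux_shape(r, ell, k2_over_k1))

if __name__ == "__main__":
    print(rho_star(10, 0.5), rho_star_numeric(10, 0.5))
```

In the limit $k_2/k_1 \to 0$ with $\ell = 1$ the formula gives $\rho_* = 1/2$, the well-known result for the ordinary TASEP.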
IV. NATURE OF POLYSOMES: NON-EQUILIBRIUM PHASE DIAGRAM OF RIBOSOME TRAFFIC

In this section we impose open boundary conditions (OBC), which are more realistic for modeling translation. The
entry of the ribosomes at one open end captures the initiation of translation while the exit of the ribosomes from the 
other open end mimics termination of translation. 



A. Phase diagram 

In a multi-dimensional abstract space spanned by some of the crucial model parameters, we identify the distinct 
regions characterized by their distinctive properties which we describe below. The resulting diagram is referred to as 
a "phase" diagram although the "phases" are not in thermodynamic equilibrium; these phases are non-equilibrium 
steady states of the system. The theoretical predictions we make in this section can be tested by using the technique
of polysome profiling [19, 20].

Our calculations here are based on the extremum current hypothesis (ECH) [44]-[47], which relates the flux $J$ under
OBC to the flux $J_{PBC}(\rho)$ of the same system under PBC. We apply the ECH to our model in the same way in
which it was used earlier for the simpler versions of our model [38, 39]. We assume that the entrance and exit points
of the track (i.e., the start and stop codons) are connected to two infinite particle reservoirs where the
number densities are $\rho_-$ and $\rho_+$, respectively. According to the ECH, for systems with a single maximum in the
function $J_{PBC}(\rho)$,

$$J = \max J_{PBC}(\rho) \ \text{ if } \ \rho_- > \rho > \rho_+, \qquad J = \min J_{PBC}(\rho) \ \text{ if } \ \rho_- < \rho < \rho_+. \tag{18}$$

These relations can be utilized to draw the surfaces separating the dynamical phases on the phase diagram. The
first step in this approach has already been completed by calculating the expression (17) for $\rho_*$. Next, we derive the
expressions for $\rho_-$ and $\rho_+$ which would give rise to the same rates of initiation and termination as indicated by $\alpha$ and
$\beta$, respectively.

Suppose $P_{jump}(\Delta t)$ is the probability that, given an empty site, a ribosome will hop onto it from the left in the next
time interval $\Delta t$. It is straightforward to see that

$$P_{jump}(\Delta t) = P(\underbrace{1 \cdots 1}_{\ell} \mid 0)\,(P_5\,\omega_{h2} + P_5^*\,\Omega_{h2})\,\Delta t, \tag{19}$$

where $P_5$ and $P_5^*$ are given by the expressions

$$P_5^{-1} = \omega_{h2}\,(k_1^{-1} + k_2^{-1}) \tag{20}$$

and 

$$P_5^* = \frac{\Omega_p\,\omega_{h2}}{\omega_p\,\Omega_{h2}}\,P_5, \tag{21}$$

respectively, and, as discussed in ref. [39],

$$P(\underbrace{1 \cdots 1}_{\ell} \mid 0) = \frac{\rho}{1 + \rho - \rho\ell}. \tag{22}$$

Note that the solutions (20) and (21) have been obtained using the normalization condition

$$\sum_{\mu=1}^{5} P_\mu + P_4^* + P_5^* = 1 \tag{23}$$

for the reservoir.

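Now $\rho_-$ is the solution of the equation $\alpha = P_{jump}$. Hence,

$$\rho_- = \frac{\alpha\left[1 + (k_2/k_1)\right]}{(\ell - 1)\left[1 + (k_2/k_1)\right]\alpha + P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]}, \tag{24}$$

with

$$P_{k_2} = k_2\,\Delta t \tag{25}$$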
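relating the probability $P_{k_2}$ to the rate constant $k_2$, where $k_2$ is given by (15). Similarly, at the exit end,

$$P_{jump}(\Delta t) = P(0 \mid \underbrace{1 \cdots 1}_{\ell})\,(P_5\,\omega_{h2} + P_5^*\,\Omega_{h2})\,\Delta t, \tag{26}$$

where

$$P(0 \mid \underbrace{1 \cdots 1}_{\ell}) = \frac{1 - \rho\ell}{1 + \rho - \rho\ell}. \tag{27}$$

The unknown density $\rho_+$ is the solution of the equation $\beta = P_{jump}$; hence,

$$\rho_+ = \frac{\beta\left[1 + (k_2/k_1)\right] - P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]}{\beta(\ell - 1)\left[1 + (k_2/k_1)\right] - \ell\,P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]}. \tag{28}$$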



Note that the equations (24) and (25) reduce, respectively, to the equations (67) and (70) of ref. [39] in the appropriate
limit.
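
The reservoir densities can be checked by substituting them back into the jump probabilities from which they were derived. The sketch below does this with two shorthand parameters that are introduced only for this example: `a` stands for $1 + (k_2/k_1)$ and `b` for $P_{k_2}[1 + (\Omega_p/\omega_p)]$; all function names and the numerical values are likewise assumptions.

```python
def rho_minus(alpha, ell, a, b):
    """Entry-reservoir density that reproduces the initiation probability alpha."""
    return alpha * a / ((ell - 1) * a * alpha + b)

def rho_plus(beta, ell, a, b):
    """Exit-reservoir density that reproduces the termination probability beta."""
    return (beta * a - b) / (beta * (ell - 1) * a - ell * b)

def entry_probability(rho, ell, a, b):
    """Probability that a ribosome hops onto a given empty site from the left:
    [rho / (1 + rho - rho*ell)] * b / a."""
    return rho / (1.0 + rho - rho * ell) * b / a

def exit_probability(rho, ell, a, b):
    """Probability that a ribosome finds the next site free and hops:
    [(1 - rho*ell) / (1 + rho - rho*ell)] * b / a."""
    return (1.0 - rho * ell) / (1.0 + rho - rho * ell) * b / a

if __name__ == "__main__":
    ell, a, b = 10, 1.5, 0.4          # assumed illustrative values
    rm = rho_minus(0.02, ell, a, b)
    rp = rho_plus(0.10, ell, a, b)
    print(rm, entry_probability(rm, ell, a, b))   # second number is ~0.02
    print(rp, exit_probability(rp, ell, a, b))    # second number is ~0.10
```

Working with the two combinations `a` and `b` keeps the example independent of the individual rate constants; any parameter set that yields the same `a` and `b` gives the same reservoir densities.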



1. Surface separating LD and MC phases 

From the ECH it follows that the surface separating the low-density (LD) and maximal-current (MC) phases on the
phase diagram of the system is obtained from the equation

$$\rho_- = \rho_* \tag{29}$$

by expressing $\rho_-$ and $\rho_*$ in terms of the rate constants for the elementary steps of the model kinetics. Hence, the
equation for this surface in the phase diagram of the model is found to be

$$\alpha = \frac{\rho_*\,P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]}{\left[1 + (k_2/k_1)\right]\left[1 - (\ell - 1)\rho_*\right]}. \tag{30}$$



2. Surface separating HD and MC phases 

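Similarly, using the condition

$$\rho_+ = \rho_*, \tag{31}$$

the equation for the surface separating the high-density (HD) and MC phases in the phase diagram of the system is given by

$$\beta = \frac{P_{k_2}\,(1 - \rho_*\ell)\left[1 + (\Omega_p/\omega_p)\right]}{\left[1 + (k_2/k_1)\right]\left[1 - (\ell - 1)\rho_*\right]}. \tag{32}$$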



3. Surface of coexistence of HD and LD phases 



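The surface on which the HD and LD phases coexist is obtained from the condition

$$J_{PBC}(\rho_-) = J_{PBC}(\rho_+), \tag{33}$$

which gives

$$\alpha = \frac{\beta\,P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]\left[1 + (k_2/k_1)\right]}{\ell\,P_{k_2}\left[1 + (\Omega_p/\omega_p)\right] + \beta\left[1 - \ell + 2(k_2/k_1) - \ell(k_2/k_1) + (k_2^2/k_1^2)\right]}. \tag{34}$$

Equivalently,

$$\beta = \frac{\alpha\,\ell\,P_{k_2}\left[1 + (\Omega_p/\omega_p)\right]}{\left[1 + (k_2/k_1)\right]\left\{P_{k_2}\left[1 + (\Omega_p/\omega_p)\right] + \alpha(\ell - 1) - (k_2/k_1)\alpha\right\}}. \tag{35}$$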
Note that the equations (30), (32), (34) and (35) reduce to the expressions (72), (74), (76) and (77), respectively, of ref. [39]
in the appropriate limit.
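
The three phase boundaries can be combined into a small classifier that returns the phase (LD, HD or MC) for given initiation and termination probabilities. This is a sketch under stated assumptions: `a` denotes $1 + (k_2/k_1)$ and `b` denotes $P_{k_2}[1 + (\Omega_p/\omega_p)]$, both shorthand introduced for this example, and the function names and numerical values are hypothetical.

```python
import math

def rho_star(ell, a):
    """Density of maximal flux, with a = 1 + k2/k1."""
    return math.sqrt(a / ell) / (1.0 + math.sqrt(ell * a))

def alpha_star(ell, a, b):
    """LD-MC boundary: value of alpha at which rho_minus equals rho_star."""
    r = rho_star(ell, a)
    return r * b / (a * (1.0 - (ell - 1) * r))

def beta_star(ell, a, b):
    """HD-MC boundary: value of beta at which rho_plus equals rho_star."""
    r = rho_star(ell, a)
    return (1.0 - r * ell) * b / (a * (1.0 - (ell - 1) * r))

def beta_coexistence(alpha, ell, a, b):
    """LD-HD coexistence line, written as beta as a function of alpha."""
    kappa = a - 1.0                                   # k2 / k1
    return alpha * ell * b / (a * (b + alpha * (ell - 1) - kappa * alpha))

def alpha_coexistence(beta, ell, a, b):
    """The same coexistence line, written as alpha as a function of beta."""
    kappa = a - 1.0
    return beta * b * a / (
        ell * b + beta * (1.0 - ell + 2.0 * kappa - ell * kappa + kappa ** 2))

def classify(alpha, beta, ell, a, b):
    """Return 'LD', 'HD' or 'MC' for the given boundary probabilities."""
    a_s, b_s = alpha_star(ell, a, b), beta_star(ell, a, b)
    if alpha >= a_s and beta >= b_s:
        return "MC"
    if alpha >= a_s:
        return "HD"
    if beta >= b_s:
        return "LD"
    return "LD" if beta > beta_coexistence(alpha, ell, a, b) else "HD"

if __name__ == "__main__":
    ell, a, b = 10, 1.5, 0.4                          # assumed illustrative values
    for point in [(0.01, 0.5), (0.5, 0.01), (0.5, 0.5), (0.03, 0.05)]:
        print(point, classify(*point, ell, a, b))
```

For $\ell = 1$, `a = 1` and `b = 1` the boundaries reduce to $\alpha_* = \beta_* = 1/2$ and the coexistence line becomes $\alpha = \beta$, as in the ordinary TASEP.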

Figs. 2 and 3 show the 3-dimensional phase diagrams plotted in the $\alpha$-$\beta$-$\phi$ space from two different perspectives,
while Figs. 4 and 5 show the 3-dimensional phase diagrams plotted in the $\alpha$-$\beta$-$P_{\omega_{r2}}$ space, also from two different
perspectives. Since none of the earlier models of ribosome traffic captures translational fidelity and kinetic proofreading
explicitly, these phase diagrams have not been reported before.

In order to compare the implications of these phase diagrams with those of the TASEP, we also project several
two-dimensional cross sections of these phase diagrams onto the $\alpha$-$\beta$ plane (see Figs. 6 and 7). On the 2D projections,
the transition from the LD phase to the MC phase takes place at $\alpha = \alpha_*$ whereas the transition from the HD phase to
the MC phase takes place at $\beta = \beta_*$.

As is evident from these 2D phase diagrams, the curvature of the lines of coexistence of HD and LD phases seems
to be a generic feature of all such models [3^, HH . The straight line $\alpha = \beta$ on which the LD and HD phases of
TASEP coexist is a manifestation of the "particle-hole" symmetry in TASEP, a special property that is not shared
by our model of ribosome traffic.

Increasing $\phi$ shifts $\alpha_*$ and $\beta_*$ to higher values. An increase of $\phi = \omega_p/(\Omega_p + \omega_p)$ can be viewed as a result of increasing
$\omega_p$ which, in turn, increases the effective rate of hopping of a ribosome from one codon to the next. It is well known
[49] that higher values of the effective hopping rate shift $\alpha_*$ and $\beta_*$ to higher values. Similarly, increasing $\omega_{r2}$ decreases
the effective hopping rate, thereby shifting $\alpha_*$ and $\beta_*$ to smaller values.



B. Effects of recycling on the phase diagram 

Recycling of ribosomes can be captured by our model by replacing the constant initiation rate $\alpha$ by an effective
initiation rate $\alpha_{eff}$. Since the availability of ribosomes for initiation is proportional to the flux of ribosomes exiting
from the stop codon, we postulate that

$$\alpha_{eff} = \alpha + q\,J(\alpha_{eff}, \beta, \{\omega_i\}), \tag{36}$$

where the coefficient $q$ depends on the relative separation between the two ends of the mRNA transcript as well as
on the diffusion constant of the ribosome subunits in the solution [32]. Note that the prescription (36) for recycling
is similar in spirit, but not identical quantitatively, to the prescription used by Gilchrist and Wagner [50] because the
latter model does not capture steric exclusion among the ribosomes.

On simple physical grounds, the effect of (36) on the phase diagram is expected to be non-trivial. Suppose $\alpha$ is
varied keeping $\beta$ fixed. At a very small value of $\alpha$ ($\alpha \ll \beta$), the flux would be determined by $\alpha$. Starting from a very
small value, increasing $\alpha$ initially increases $J$ which, in turn, increases the effective initiation rate $\alpha_{eff}$. However,
beyond a certain value of $\alpha$, $\alpha_{eff}$ is no longer rate limiting and the flux becomes independent of $\alpha$.

Exploiting the relation between our model and $\ell$-TASEP, we plot the phase diagram for the extended model that
captures recycling of ribosomes through $\alpha_{eff}$. We use the results reported in refs. [29T [3^| for $\ell$-TASEP and get

$$J = \frac{\alpha_{eff}(1 - \alpha_{eff})}{1 + \alpha_{eff}(\ell - 1)}, \tag{37}$$

$$\alpha_{eff} = \alpha + q\,\frac{\alpha_{eff}(1 - \alpha_{eff})}{1 + \alpha_{eff}(\ell - 1)}, \tag{38}$$

which gives

$$\alpha_{eff} = \frac{\alpha(\ell - 1) + q - 1 + \sqrt{\left[\alpha(\ell - 1) + q - 1\right]^2 + 4(\ell - 1 + q)\alpha}}{2(\ell - 1 + q)}. \tag{39}$$

So, at the LD-HD interface,

$$\beta = \alpha_{eff}, \tag{40}$$

at the LD-MC boundary,

$$\alpha_{eff} = \frac{1}{1 + \sqrt{\ell}}, \tag{41}$$

and at the HD-MC boundary,

$$\beta = \frac{1}{1 + \sqrt{\ell}}. \tag{42}$$
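
The self-consistency condition for the effective initiation rate is a quadratic equation, so its positive root can be checked directly against the fixed-point relation $\alpha_{eff} = \alpha + qJ$. The sketch below does this and also evaluates how the bare initiation rate at the LD-MC boundary moves with $q$; the function names and parameter values are assumptions made for this illustration.

```python
import math

def flux_ld(alpha_eff, ell):
    """Low-density flux of the l-TASEP (unit hopping rate) at entry rate alpha_eff."""
    return alpha_eff * (1.0 - alpha_eff) / (1.0 + alpha_eff * (ell - 1))

def alpha_effective(alpha, q, ell):
    """Positive root of alpha_eff = alpha + q * flux_ld(alpha_eff, ell)."""
    c = alpha * (ell - 1) + q - 1.0
    denom = 2.0 * (ell - 1 + q)
    if denom == 0.0:
        return alpha              # ell = 1 and q = 0: nothing to solve
    return (c + math.sqrt(c * c + 4.0 * (ell - 1 + q) * alpha)) / denom

def alpha_star_bare(q, ell):
    """Bare initiation rate at which alpha_eff reaches the LD-MC threshold
    1 / (1 + sqrt(ell))."""
    a_c = 1.0 / (1.0 + math.sqrt(ell))
    return a_c - q * flux_ld(a_c, ell)

if __name__ == "__main__":
    ell = 10
    for q in (0.0, 0.4, 0.8):
        print(q, alpha_effective(0.05, q, ell), alpha_star_bare(q, ell))
```

With $q = 0$ the effective rate equals the bare rate, and a larger $q$ lowers the bare initiation rate needed to reach the maximal-current phase while leaving the threshold in $\beta$ untouched.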

The resulting 2-d phase diagram in the $\alpha$-$\beta$ plane is plotted in Fig. 8. An increase of the recycling factor $q$ shifts $\alpha_*$ to
a smaller value while $\beta_*$ remains unchanged. This trend of variation is consistent with the fact that recycling affects
only the initiation rate without influencing the rate of termination. It is also consistent with the trends of variation of
the coverage density and flux of ribosomes that we observed in our computer simulations, which we describe below.
1. Variations of flux and coverage density with the extent of recycling



In order to demonstrate the effects of recycling in a form that would be closer to one's physical intuition, we now 
show the variation of the average flux and the average coverage density of the ribosomes with the parameter q which 
is a measure of the extent of recycling. For this purpose, we carried out computer simulations of our model for several 
different values of $q$. All the data reported here were obtained for $L = 1000$ and $\ell = 10$. Since the linear size of a
ribosome is measured here in the units of codons, $\ell = 10$ is a realistic choice because in recent experiments [1, 2] it
has been observed that each ribosome covers about 30 nucleotides, i.e., 10 codons, on the mRNA track. Typical values of some of
the rate constants have been reported in the literature [HI, [52| . For those rate constants whose numerical values are 
not available in the literature, we have assumed some reasonable values based on physical intuition. However, our 
conclusions are not sensitive to the precise numerical values of the rate constants. 

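The rate constants which we used for the simulations are $\omega_a = 25\,\mathrm{s}^{-1}$, $\omega_{h1} = 25\,\mathrm{s}^{-1}$, $\omega_{h2} = 25\,\mathrm{s}^{-1}$,
$\omega_p = 25\,\mathrm{s}^{-1}$, $\omega_{bf} = 25\,\mathrm{s}^{-1}$, $\omega_{br} = 25\,\mathrm{s}^{-1}$, $\omega_{r1} = 10\,\mathrm{s}^{-1}$, $\omega_{r2} = 10\,\mathrm{s}^{-1}$,
$\Omega_p = 5\,\mathrm{s}^{-1}$, $\Omega_{bf} = 5\,\mathrm{s}^{-1}$, $\Omega_{br} = 5\,\mathrm{s}^{-1}$, $\Omega_{h2} = 5\,\mathrm{s}^{-1}$.
All the rate constants were converted to dimensionless transition probabilities using the formula
$P_\omega = 1 - \exp(-\omega\,\Delta t)$, where the time step $\Delta t = 0.005$ s is used for all the simulation runs.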
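
The conversion from rates to transition probabilities can be sketched in a few lines of Python. The function name is hypothetical; the three rates shown (25, 10 and 5 per second) are the distinct numerical values quoted above, and the time step is 0.005 s.

```python
import math

DT = 0.005  # time step in seconds, as used in the simulation runs


def rate_to_probability(rate, dt=DT):
    """Probability that a transition with the given rate (1/s) occurs within dt."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return -math.expm1(-rate * dt)


if __name__ == "__main__":
    for rate in (25.0, 10.0, 5.0):
        print(rate, rate_to_probability(rate))
```

For $\omega\,\Delta t \ll 1$ the probability is close to $\omega\,\Delta t$; with the values above the largest product is 0.125, so the exponential form differs noticeably from the linear approximation only for the fastest transitions.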

At first sight, it may appear that there is some ambiguity in the definition of $\alpha_{\mathrm{eff}}$: what value of $J$ should be
used in (36)? In order to avoid ambiguity, we average the spatially-averaged flux further over the time period elapsed
since the last entry of a ribosome (i.e., the initiation of translation by the ribosome closest to the start codon). This
"doubly-averaged" value of the flux $J$ is used to compute $\alpha_{\mathrm{eff}}$.

In fig. 9 we plot the flux and the coverage density of the ribosomes as functions of $q$. Starting from a vanishingly
small value, as q is increased, both the flux and coverage density increase. However, beyond a limiting value, both 
become practically independent of q; this trend of variation is caused by a transition from the LD phase to either the 
HD phase or to the MC phase, depending on the values of the set of other parameters. 



C. Experimental tests of the phase diagram with polysome profile 

The three different phases are characterized by three different densities; the expressions for these densities have 
been derived above. Therefore, our theoretical predictions can be tested by measuring the average densities. For 
this purpose, polysome profiling [19, 20] would be adequate. However, all the analytical calculations for the phase
diagram have been carried out for sequence-homogeneous mRNA strands. Therefore, a poly-U strand of mRNA, with
appropriate start and stop codons [3], should be used in the experiment.



V. INSTANTANEOUS SPATIAL DISTRIBUTION OF RIBOSOMES: RIBOSOME PROFILE 

In all the sections above we have calculated quantitative characteristics which do not require information on the 
spatial distributions of the ribosomes on the mRNA transcript. In this section we explore some other quantitative 
features of ribosome traffic which deal with the spatial distributions of the ribosomes. Our theoretical predictions on 
the spatial distributions of the ribosomes can be tested with the ribosome profiling technique [1, 2].



A. Distance-headway distribution 

The distance-headway (DH) is defined as the spatial separation between two successive ribosomes on the same
mRNA transcript. At any given instant of time, the magnitude of the DH fluctuates from one pair of ribosomes to
another; the instantaneous spatial distribution of the ribosomes is characterized by the corresponding distribution of the
DHs. The DH distribution is used extensively for quantitative characterization of macroscopic vehicular traffic [11, 12].
In this subsection we calculate the DH distribution for our kinetic model of ribosome traffic. Since ribosome profiling
[1] provides the exact positions of the ribosomes at the instant when translation was stopped, the DH distribution can
be extracted by repeating this profiling a sufficiently large number of times.
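
A minimal Python sketch of this extraction step is given below. It assumes, purely for illustration, that each profiling snapshot is available as a list of integer codon positions marking the leftmost codon covered by each ribosome, and that the DH is counted as the number of empty codons between two successive ribosomes; the function name and the data layout are hypothetical.

```python
from collections import Counter


def headway_distribution(snapshots, ell):
    """Normalized histogram of gaps between successive ribosomes.

    snapshots: iterable of lists of leftmost-codon positions (one list per snapshot)
    ell: number of codons covered by one ribosome
    """
    counts = Counter()
    for positions in snapshots:
        ordered = sorted(positions)
        for left, right in zip(ordered, ordered[1:]):
            gap = right - left - ell  # empty codons between the two ribosomes
            if gap < 0:
                raise ValueError("ribosomes overlap in this snapshot")
            counts[gap] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gap: n / total for gap, n in sorted(counts.items())}


if __name__ == "__main__":
    print(headway_distribution([[1, 11, 25], [3, 17]], ell=10))
```

Pooling the gaps over many snapshots, rather than normalizing each snapshot separately, gives every pair of successive ribosomes the same weight in the estimated distribution.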

Our system may be viewed as one that consists of $M$ identical rods, each of length $\ell$, distributed over a lattice of
$L$ sites. First, assuming a ring-like mRNA track, we get the DH distribution for the corresponding number density $\rho$
which, because of the PBC, does not fluctuate. In this case, the expression for the DH distribution is given by [29]

$$P_{dh}(m, \rho) = \left(\frac{\rho}{\rho_s}\right)\left(\frac{\rho_h}{\rho_s}\right)^{m} \tag{43}$$

where $\rho_h = 1 - \rho\ell$ is the density of holes and $\rho_s = \rho + \rho_h = 1 + \rho - \rho\ell$. Hence,

$$P_{dh}(m, \rho) = \frac{\rho\,(1 - \rho\ell)^{m}}{(1 + \rho - \rho\ell)^{m+1}} \tag{44}$$

The number density of the ribosomes for the real system under OBC is a fluctuating quantity. But the mean density
deep inside the bulk (around the central region of the lattice) can be extracted numerically from computer simulations.
Substituting the numerically estimated density of the ribosomes under OBC into the expression (44) we get the DH
distribution under OBC. This distribution is plotted in fig. 10. The straight lines on the semi-log plot reflect the
geometric nature of the distribution (the discrete analog of the exponential distribution). In order to test the validity
of the approximate scheme used above to derive the DH distribution by a combination of analytical and numerical
arguments, we have also computed the DH distribution directly by computer simulation; the simulation data are also
plotted in fig. 10. The theoretically derived lines are in reasonably good agreement with the DH distribution obtained
by computer simulation.
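
The geometric character of the DH distribution of Eq. (44) is easy to verify numerically. The Python sketch below evaluates the expression for an assumed bulk density (the value 0.05 is chosen only for illustration and is not taken from our simulations) and checks the normalization, the constant ratio of successive probabilities, and the mean headway. The function name is hypothetical.

```python
def dh_probability(m, rho, ell):
    """Probability of a headway of m empty sites at number density rho, Eq. (44)."""
    rho_h = 1.0 - rho * ell   # density of holes
    rho_s = rho + rho_h       # density of rods plus holes
    return (rho / rho_s) * (rho_h / rho_s) ** m


if __name__ == "__main__":
    rho, ell = 0.05, 10
    probs = [dh_probability(m, rho, ell) for m in range(2000)]
    print("normalization:", sum(probs))
    print("ratio P(m+1)/P(m):", probs[1] / probs[0])
    print("mean headway:", sum(m * p for m, p in enumerate(probs)))
```

Because successive probabilities differ by the constant factor $\rho_h/\rho_s$, the logarithm of $P_{dh}$ is linear in $m$, and the mean headway equals $\rho_h/\rho$, the number of holes per ribosome.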



B. Influence of slow codons on the density profile and flux 



It is well known that translation of some codons takes place at a very slow rate; these are often referred to as
"hungry" codons. However, we will use the term "slow" to refer to all those codons which get translated at a much
slower rate than other codons. In this subsection we explore the effects of bottlenecks created by such slow codons
against the forward movement of ribosomes. In particular, we investigate the effects of slow codons on the average
density profile and flux of ribosomes in ribosome traffic.

In the model that we simulated for this purpose, an mRNA transcript consists of only two different types of codons.
In the computer simulations of our model we assign to $\omega_a$ for a slow codon a numerical value ten times smaller
than that for a normal codon. Simultaneously, the numerical value of $\omega_{r1}$ assigned to a slow codon is ten times larger
than that of a normal codon. Since in this particular study we are interested mainly in the effects of bottlenecks, we
ignore the possibility of misincorporation by erasing the branched pathway, putting $\Omega_p = 0 = \Omega_{h2}$.

In the first set of simulations, we put four slow codons at the center of the stretch of mRNA between the start
and the stop codon. Such a single extended bottleneck leads to a "phase-segregated" profile where on one side of the
bottleneck the average density is much higher than that on the other side (see fig. 11). Profiles are plotted for different
values of $\omega_{r2}$. A higher value of $\omega_{r2}$ reduces the effective hopping rate for both normal and slow codons. But the
effective hopping rate for a slow codon decreases more because of its higher value of $\omega_{r1}/\omega_{h1}$. Thus, the higher the value
of $\omega_{r2}$, the larger the difference between the effective rates of hopping from normal and slow codons. This, in turn,
leads to a larger jump discontinuity of the density across the bottleneck at a larger value of $\omega_{r2}$.

In the second set of simulations, the slow codons were not clustered together. Instead, four equispaced slow codons
were placed at the sites 200, 400, 600, and 800 on a lattice of total length $L = 1000$. The average density profiles for
this case are plotted in fig. 12 for two different values of $\omega_{r2}$. Both the profiles exhibit discontinuous jumps in the
coverage density; the position of each minimum in the coverage density coincides with the location of a slow codon.
Moreover, periodic oscillations with period $\ell$ are observed in the vicinity of each slow codon. Similar
results were obtained earlier in TASEP-type models of ribosome traffic [34]; however, unlike our data shown in fig. 12,
the effects of kinetic proofreading and futile cycles could not be addressed by the model of ref. [34].

In order to emphasize the effect of clustering of slow codons on the overall rate of protein synthesis we plot the flux as
a function of $\omega_{r2}$ in fig. 13 for the two different conditions discussed above. The upper curve corresponds to the setup
where the four slow codons are placed equidistantly on the lattice. The lower curve corresponds to the setup where all four
slow codons are clustered together at the center of the system. The data clearly show that, without increasing
the number of slow codons, the rate of synthesis of proteins can be reduced drastically by clustering the slow codons
into a single bottleneck.
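
The qualitative effect of clustering can be reproduced with a much simpler model than the one simulated here. The Python sketch below is a plain TASEP with particles of unit size and random-sequential updating; it is not the full mechano-chemical model and does not use $\ell = 10$. All names and parameter values (lattice length 100, slow hopping probability 0.1, the positions of the slow sites, the number of sweeps) are assumptions made only for this illustration.

```python
import random


def tasep_flux(L, slow_sites, slow_rate=0.1, alpha=1.0, beta=1.0,
               sweeps=5000, warmup=1000, seed=0):
    """Exit flux (particles per sweep) of an open TASEP with a few slow sites."""
    rng = random.Random(seed)
    rate = [1.0] * L                  # hopping probability out of each site
    for i in slow_sites:
        rate[i] = slow_rate
    occ = [0] * L                     # occupation numbers, lattice initially empty
    exits = 0
    for sweep in range(sweeps):
        for _ in range(L + 1):        # one sweep = L + 1 random update attempts
            i = rng.randrange(-1, L)  # -1 denotes the entry move
            if i == -1:
                if occ[0] == 0 and rng.random() < alpha:
                    occ[0] = 1
            elif occ[i] == 1:
                if i == L - 1:
                    if rng.random() < beta:
                        occ[i] = 0
                        if sweep >= warmup:
                            exits += 1
                elif occ[i + 1] == 0 and rng.random() < rate[i]:
                    occ[i], occ[i + 1] = 0, 1
    return exits / (sweeps - warmup)


if __name__ == "__main__":
    print("clustered:  ", tasep_flux(100, [48, 49, 50, 51]))
    print("equidistant:", tasep_flux(100, [20, 40, 60, 80]))
```

Even in this reduced setting the same number of slow sites is expected to give a lower flux when the sites are adjacent than when they are well separated, in line with the trend described above.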



1. Probing spatial distribution of ribosomes 

The ribosome profiling technique [1, 2] is ideally suited to probe the instantaneous spatial distribution of ribosomes
on the same mRNA transcript. But our model does not take into account the variation of rate constants arising
from sequence inhomogeneity of the mRNA transcript. Therefore, at first sight, it may appear that a homogeneous
sequence (e.g., a poly-U) would be the most appropriate transcript for testing our theoretical predictions. However, the
technique of ribosome profiling [1, 2] cannot locate the exact positions of the ribosomes on a sequence-homogeneous
mRNA transcript. Therefore, we suggest that the experiment should be performed with a special type of
sequence-inhomogeneous mRNA transcript where, because of the intrinsic degeneracy of the genetic code, all the codons are
synonymous, i.e., correspond to the same amino acid. Only the cognate tRNA molecules carrying the correct amino
acid are to be supplied to the solution. We do not expect significant codon-to-codon variation of the rate constants in
this case. For studying the effects of translational fidelity and proofreading, tRNA molecules carrying a non-cognate
species of amino acid should also be supplied.

In case it turns out to be difficult to extract the exact positions of the ribosomes from ribosome profiling of such
an mRNA strand where the degenerate codons are distributed randomly, we suggest an alternative strategy. Recall
that six synonymous codons correspond to the same amino acid Arg; similarly, there is six-fold degeneracy also for
the amino acids Leu and Ser (see fig. 14(a)). In principle, one can synthesize an artificial mRNA transcript of the type
shown in fig. 14(b) using six synonymous codons from any of the three possible sets shown in fig. 14(a). An mRNA
transcript with such a non-random inhomogeneous codon sequence allows unambiguous identification of the positions
of the ribosomes on it while using the ribosome profiling technique [1].
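
The requirement behind this suggestion can be checked computationally: a footprint can be assigned to a unique position only if every footprint-sized window of the codon sequence is distinct. The Python sketch below tests this for three transcripts built from the six Arg codons (CGU, CGC, CGA, CGG, AGA, AGG). The de Bruijn construction is merely one hypothetical way of generating a non-random sequence with this property and is not claimed to be the pattern of fig. 14(b); the function names and the footprint length of 10 codons are likewise assumptions made for illustration.

```python
ARG_CODONS = ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"]  # six synonymous Arg codons


def de_bruijn(k, n):
    """de Bruijn sequence B(k, n): every word of length n occurs once cyclically."""
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def footprints_unique(codons, footprint=10):
    """True if all windows of `footprint` consecutive codons are distinct."""
    windows = [tuple(codons[i:i + footprint])
               for i in range(len(codons) - footprint + 1)]
    return len(windows) == len(set(windows))


homogeneous = ["CGU"] * 1000                                 # single repeated codon
periodic = [ARG_CODONS[i % 6] for i in range(1000)]          # simple period-6 repeat
designed = [ARG_CODONS[s] for s in de_bruijn(6, 4)[:1000]]   # hypothetical design

if __name__ == "__main__":
    for name, seq in (("homogeneous", homogeneous),
                      ("periodic", periodic),
                      ("designed", designed)):
        print(name, footprints_unique(seq))
```

A strictly periodic arrangement of the six codons fails the test because windows separated by one period are identical, which indicates that the non-random pattern has to be chosen with some care.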



VI. SUMMARY AND CONCLUSION 



In this paper we have developed a theoretical framework that captures several key features of translation as well as 
the spatio-temporal organization of polysomes. First, the selection of aa-tRNA in our model is a two-stage process; the 
second stage captures kinetic proofreading. Second, our model allows occasional translational error and we calculate 
several quantities as functions of the translational fidelity. We have also incorporated some of the other features of 
the mechano-chemical cycle of ribosomes in the elongation stage which, to our knowledge, have not been incorporated 
in any earlier model. On the basis of our hypothesis for capturing the effects of ribosome recycling, we have predicted 
the effects of recycling on the spatial profile of the ribosomes as well as on the rate of protein synthesis. 

Here we have also investigated the spatial organization of ribosomes in polysomes in terms of the distance-headway 
distribution; it is a quantity that is used extensively to characterize crowding in vehicular traffic. We have also 
identified the parameter regimes which display distinct characters of polysomes and the corresponding rates of protein 
production in our model system. Finally, we have also demonstrated the effects of sequence inhomogeneity of the 
mRNA transcript, particularly, that of the clustering of slow codons. 

We hope this work will inspire experimental investigations for measuring new quantities. It is possible to test some
of our new predictions using polysome profiling techniques. But more interesting results on the spatial distributions of the
ribosomes on the mRNA transcript would require ribosome profiling [1, 2].
Figure Captions

Fig. 1: Detailed mechanochemical cycle of a ribosome on its track. The integer indices ..., $i - 1$, $i$, $i + 1$, ... label the
codons on the mRNA transcript. Although the same set of transitions is allowed from each codon, only those from
(and to) the codon $i$ are shown explicitly.

Fig. 2: Phase diagram of ribosome traffic model in the 3-dimensional space spanned by $\alpha$, $\beta$ and $\phi$.
Fig. 3: Same data as in fig. 2 except that plotted from a different perspective.

Fig. 4: Phase diagram of ribosome traffic model in the 3-dimensional space spanned by $\alpha$, $\beta$ and $\omega_{r2}$.
Fig. 5: Same data as in fig. 4 except that plotted from a different perspective.

Fig. 6: Projections of several two-dimensional cross sections of the three-dimensional phase diagram, plotted in figs. 2
and 3, onto the $\alpha$-$\beta$ plane. Each cross section corresponds to a fixed value of $\phi$.

Fig. 7: Projections of several two-dimensional cross sections of the three-dimensional phase diagram, plotted in figs. 4
and 5, onto the $\alpha$-$\beta$ plane. Each cross section corresponds to a fixed value of $\omega_{r2}$.

Fig. 8: 2D phase diagram of TASEP and $\ell$-TASEP in the $\alpha$-$\beta$ plane in the presence of recycling of ribosomes.

Fig. 9: Variation of the average flux and coverage density with the recycling factor $q$. The green and red curves have
the same $\alpha$ values whereas the red and blue curves have the same $\beta$ values.

Fig. 10: Distribution of distance-headways. The lines have been obtained by using the formula (44) whereas the
discrete data points have been obtained directly from computer simulations.

Fig. 11: Density profile of ribosomes for single bottleneck.

Fig. 12: Density profile of ribosomes for slow sites at $i$ = 200, 400, 600, 800.

Fig. 13: Flux variation with $\omega_{r2}$ for both lattices (see text for detail).

Fig. 14: Codon-sequence suggested for testing our theoretical predictions. 